Applying Name Entity Recognition to Informal Text

Authors

  • Yu-shan Chang
  • Yun-Hsuan Sung
Abstract

Although Named Entity Recognition (NER) has been well studied in recent years, it is seldom applied to informal documents such as e-mail messages and newsgroup postings. Unlike formal text, which is well structured and contains few errors, informal text makes named entities much harder to recognize. The key problems are that informal text is unstructured and contains more grammatical and spelling errors. These properties degrade the performance of existing classifiers and of the carefully designed features that suit NER on formal text. In this project, we apply two approaches that are often used for formal-text NER, the Maximum Entropy classifier (MaxEnt) and Conditional Random Fields (CRF), to informal-text NER. We run experiments to determine whether they still perform well on the informal task. We also focus on how to extract efficient and effective features specifically for informal text.

Similar resources

Extracting Personal Names from Email: Applying Named Entity Recognition to Informal Text

There has been little prior work on Named Entity Recognition for "informal" documents like email. We present two methods for improving performance of person name recognizers for email: email-specific structural features and a recall-enhancing method which exploits name repetition across multiple documents.

A Novel Approach to Conditional Random Field-based Named Entity Recognition using Persian Specific Features

Named Entity Recognition is an information extraction technique that identifies named entities in a text. Three kinds of methods have conventionally been used to extract named entities from text: rule-based, machine-learning-based, and hybrids of the two. Machine-learning-based methods perform well on Persian if they are trained with good features. To get good performanc...

Improving Persian Named Entity Recognition Using Kasre Ezafe

Named entity recognition is a process in which the names of people, places (cities, countries, seas, etc.), and organizations (public and private companies, international institutions, etc.), as well as dates, currencies, and percentages, are identified in a text. Named entity recognition plays an important role in many NLP tasks such as semantic role labeling, question answering, summarization, machine ...

Extracting Personal Names from Emails: Applying Named Entity Recognition to Informal Text

The problem of named entity recognition (NER) has been well-studied, but there has been little prior work on NER for “informal” documents—i.e., documents like email messages and bulletin board postings that are prepared quickly, and intended for a narrow audience. In this paper, we investigate NER for informal text, via an experimental study of recognizing personal names in email. We study the ...

Publication date: 2005